perm filename 0[0,BGB]1 blob sn#018297 filedate 1973-01-01 generic text, type T, neo UTF8
00100	DRAFT THESIS OUTLINE.					DECEMBER 1972
00200	
00300	                          GEOMETRIC VISION
00400	                      - draft thesis outline -
00500	
00600	                           B. G. Baumgart
00700	
00800	
00900	ABSTRACT:
01000	
01100		This  thesis  is about the design of a computer vision system
01200	based on modeling the mundane physics  of  the  objects  and  scenery
01300	being  viewed.  The  vision  system  discussed  is a function that in
01400	principle can  be  applied  to  a  reel  of  video  tape  to  compute
01500	blueprints  and  geodetic maps. Applications of this system to object
01600	recognition,  scene  analysis   and   robot   vehicle   control   are
01700	demonstrated.
01800	
01900	
02000	CONTENTS:
02100	
02200		I. MEMORY.
02300	
02400		   A. 	Representation of a Geometric Mental Universe.
02500		   B.	Region-Edge Image Representation.
02600		   C.	Semantic, Feature and Predicate Representation.
02700	
02800		II. PROCESS.
02900	
03000		   A.	Image Prediction.
03100		   B.	Image Perception.
03200		   C.	Image Comparison.
03300		   D.	Camera Locus Solution.
03400		   E.	World Model Modification.
03500			   1.	delete object from map.
03600			   2.	add known object to map. (recognition).
03700			   3. 	add or alter object in dictionary.
03800	
03900		III. APPLICATION.
04000	
04100		   A.	Block Scenes.
04200			   1. deletion of a block from a scene.
04300			   2. addition of blocks to a scene.
04400		   B.	Tools and things.
04500			   1. complicated object perception.
04600			   2. known object recognition.
04700		   C.	Robot Vehicle.
04800			   1. known road servoing.
04900			   2. landscape perception.
     

00100	I. MEMORY STRUCTURE.
00200	
00300		In order to get a computer to deal with the physical world, it
00400	must  have  a  data  representation  on  which computations involving
00500	space, time, shape, size and the appearance of things can be done. In
00600	this  section,  a  representation  for  the  topology,  geometry  and
00700	photometry of everyday things  and  scenes  is  explained.  The  data
00800	structures  discussed  are  implemented  as  small  blocks  of  words
00900	containing pointers and data in the fashion  usual  to  graphics  and
01000	simulation;  an introduction to this technology can be found in Knuth
01100	[1].  Although  the  language  of  implementation  is PDP-10 machine
01200	code,  the  data  and  functions  presented below are accessible from
01300	higher level languages like LISP and ALGOL.
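
	As an illustration of the block-of-words style, the following Python sketch keeps every node block in one flat array of words and stores pointers as word indices. The block size of four words, the free-list scheme and the names Memory, alloc and release are assumptions made for this example; they are not taken from the PDP-10 implementation.

```python
from typing import List

BLOCK = 4   # words per node block (hypothetical size)
NIL = -1    # null pointer value (assumption of this sketch)


class Memory:
    """A flat array of words carved into fixed-size node blocks."""

    def __init__(self, nblocks: int) -> None:
        self.words: List[int] = [0] * (nblocks * BLOCK)
        # Thread all blocks onto a free list through word 0 of each block.
        for i in range(nblocks):
            nxt = (i + 1) * BLOCK if i + 1 < nblocks else NIL
            self.words[i * BLOCK] = nxt
        self.free = 0 if nblocks else NIL

    def alloc(self) -> int:
        """Take one block off the free list and return its address."""
        if self.free == NIL:
            raise MemoryError("no free node blocks")
        addr = self.free
        self.free = self.words[addr]
        # Word 0 becomes a null pointer; the remaining words are cleared.
        self.words[addr:addr + BLOCK] = [NIL] + [0] * (BLOCK - 1)
        return addr

    def release(self, addr: int) -> None:
        """Return a block to the head of the free list."""
        self.words[addr] = self.free
        self.free = addr


mem = Memory(3)
a = mem.alloc()
b = mem.alloc()
mem.words[a] = b          # word 0 of block a: pointer to block b
mem.words[a + 1] = 1972   # word 1 of block a: a data word
follow = mem.words[a]     # following the pointer gives the address of b
mem.release(a)
c = mem.alloc()           # the released block is handed out again
```

	Because a pointer is only an index into the word array, the same blocks can be read by any language that can address the array, which is the property the paragraph above relies on.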
01400	
01500	I.A. Representation of a Geometric Mental Universe.
01600	
01700		At the top of the data structure is a  single  universe  node
01800	from  which  everything  else can be reached.   Immediately below the
01900	universe node is a ring  of  world  models.   A  robot  dealing  with
02000	physical world sensor input, such as video data, has one of its world
02100	models dedicated to simulating  the  immediate  here  and  now;  this
02200	mental  world  is  called the reality world model. In addition to the
02300	reality world, a robot may have  fantasy  world  models  for  problem
02400	solving, planning or for recalling platonic object prototypes. In the
02500	following, a two-world mental universe will be the most common,  with
02600	the  reality world being referred to as a "map" and the fantasy world
02700	being referred to as a "dictionary".
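
	The two-world universe described above might be sketched as follows. Python is used here for brevity; the class names Universe and World, the field names and the method names are assumptions of this sketch and are not the names used in the actual implementation.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class World:
    """One world model; `next` closes the worlds into a ring."""
    name: str
    next: Optional["World"] = field(default=None, repr=False, compare=False)


@dataclass
class Universe:
    """The single root node from which every world model can be reached."""
    first: Optional[World] = None

    def add_world(self, name: str) -> World:
        world = World(name)
        if self.first is None:
            world.next = world            # a ring of one points at itself
            self.first = world
        else:
            world.next = self.first.next  # splice in after the first world
            self.first.next = world
        return world

    def worlds(self) -> Iterator[World]:
        """Walk the ring exactly once, starting at the first world."""
        node = self.first
        while node is not None:
            yield node
            node = node.next
            if node is self.first:
                break


universe = Universe()
reality = universe.add_world("map")         # the reality world model
fantasy = universe.add_world("dictionary")  # the fantasy world model
names = [w.name for w in universe.worlds()]
```

	A ring has no distinguished end, so further fantasy worlds can be spliced in without changing how the universe node reaches them.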
02800	
02900		Geometric world models have four  basic  kinds  of  nodes:
03000	body, face, edge and vertex. The face, edge and vertex nodes are used
03100	to form polyhedrons which may be attached to body nodes.  Body  nodes
03200	in  turn  are  connected  to  each other in rings and trees to form a
03300	world model. Additional kinds of nodes  describe  cameras  and  light
03400	sources  as  well  as  temporary  data  such  as shadows, spines, and
03500	trajectories.
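
	The four basic node kinds can be illustrated with a deliberately simplified Python sketch. It makes several assumptions: Python lists stand in for the rings, an edge records only its two end vertices and the faces that meet at it, and face orientation is ignored. The full edge structure is the subject of AIM-179 and is not reproduced here; all class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Vertex:
    xyz: Tuple[float, float, float]


@dataclass(eq=False)
class Face:
    corners: Tuple[Vertex, ...]   # vertices on the face perimeter


@dataclass(eq=False)
class Edge:
    ends: Tuple[Vertex, Vertex]
    faces: List[Face] = field(default_factory=list)  # faces meeting at this edge


@dataclass(eq=False)
class Body:
    name: str
    parent: Optional["Body"] = field(default=None, repr=False)
    parts: List["Body"] = field(default_factory=list)      # sub-bodies (tree)
    vertices: List[Vertex] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)
    faces: List[Face] = field(default_factory=list)

    def attach(self, part: "Body") -> None:
        """Hang another body below this one in the body tree."""
        part.parent = self
        self.parts.append(part)


def tetrahedron(name: str) -> Body:
    """Build a body carrying the simplest polyhedron: 4 vertices, 6 edges, 4 faces."""
    body = Body(name)
    points = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    body.vertices = [Vertex(p) for p in points]
    body.faces = [Face(tri) for tri in combinations(body.vertices, 3)]
    for a, b in combinations(body.vertices, 2):
        edge = Edge((a, b))
        edge.faces = [f for f in body.faces if a in f.corners and b in f.corners]
        body.edges.append(edge)
    return body


scene = Body("scene")
block = tetrahedron("block")
scene.attach(block)
V, E, F = len(block.vertices), len(block.edges), len(block.faces)
```

	For this tetrahedron V - E + F = 4 - 6 + 4 = 2, as Euler's formula requires of a simple polyhedron, and every edge lies between exactly two faces.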
03600	
03700		...continuation of this section follows AIM-179,
03800		"Winged Edge Polyhedron Representation" - Baumgart.
     

00100	I.B. Region-Edge Image Representation.
00200	
00300		The image data structure  presented  in  this  section  is  a
00400	computer's  internal  notation  for  what  is  vulgarly called a line
00500	drawing; the common term is misleading because it  does  not  suggest
00600	the  equally  important  space between the lines; terms closer to the
00700	idea would be "mosaic drawing" or "stained glass window drawing".
00800	
00900		The  data  structure  has  four main levels:  TV  raster,  video
01000	intensity contour, arc contour, and region-edge.
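
	The point that the space between the lines matters as much as the lines themselves can be illustrated with a toy example. The Python sketch below assumes the raster has already been divided into labeled regions, skipping the intensity contour and arc contour levels entirely, and records each edge together with the pair of regions it separates. The raster data and the function name region_edges are hypothetical; this is not the method of SAILON-71.

```python
from typing import Dict, FrozenSet, List

# A tiny raster of region labels, standing in for a segmented TV image.
raster = [
    "AAB",
    "AAB",
    "CCB",
]


def region_edges(rows: List[str]) -> Dict[FrozenSet[str], int]:
    """Map each pair of adjacent regions to the length of their shared
    border, measured in pixel sides."""
    borders: Dict[FrozenSet[str], int] = {}
    for y, row in enumerate(rows):
        for x, label in enumerate(row):
            # Look only right and down so each pixel side is counted once.
            for nx, ny in ((x + 1, y), (x, y + 1)):
                if ny < len(rows) and nx < len(row):
                    other = rows[ny][nx]
                    if other != label:
                        key = frozenset((label, other))
                        borders[key] = borders.get(key, 0) + 1
    return borders


edges = region_edges(raster)
```

	Each key names the two regions on either side of an edge, so the "stained glass" can be recovered from the edges and the edges from the regions.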
     

00100		...continuation of this section follows SAILON-71,
00200		"CART'S EYE THREE and its IMAGE REPRESENTATION" - Baumgart.
00300	
00400	
     

00100	II. PROCESS.
00200	
00300	   A.	Image Prediction.
00400	   B.	Image Perception.
00500	   C.	Image Comparison.
00600	   D.	Camera Locus Solution.
00700	   E.	World Model Modification.
00800		   1.	delete object from map.
00900		   2.	add known object to map. (recognition).
01000		   3. 	add or alter object in dictionary.
01100	
01200	III. APPLICATION.
01300	
01400	   A.	Block Scenes.
01500		   1. deletion of a block from a scene.
01600		   2. addition of blocks to a scene.
01700	   B.	Tools and things.
01800		   1. complicated object perception.
01900		   2. known object recognition.
02000	   C.	Robot Vehicle.
02100		   1. known road servoing.
02200		   2. landscape perception.